Binary versus Real-valued Reward Functions under Coevolutionary Reinforcement Learning
Authors
Abstract
Models of coevolution supporting competitive and cooperative behaviors can be used to decompose a problem while scaling to large environmental state spaces. This work examines the significance of several design decisions that affect the deployment of a distinction-based formulation of competitive coevolution. Specifically, competitive coevolutionary formulations with and without point-population speciation are compared to stochastic sampling of the environment under both binary and real-valued rewards. The additional structure implicit in the competitive coevolutionary models is shown to be of significant benefit under binary rewards; under real-valued feedback, however, stochastic sampling yields more dependable performance. It is also observed that cooperation between multiple solutions is far more prevalent under real-valued rewards than under binary rewards.
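The contrast between binary (pass/fail) and real-valued (graded) feedback can be illustrated with a deliberately simplified sketch. Everything here is an illustrative assumption, not the paper's actual method: the target function y = 3x, the pass/fail tolerance, and the (1+1) hill climber standing in for a full coevolutionary algorithm are all hypothetical choices made only to show why graded rewards give the search a gradient while binary rewards produce flat plateaus.

```python
import random

random.seed(0)  # deterministic toy run

# A "learner" is a single slope w approximating the target y = 3x on sampled points.
def predict(w, x):
    return w * x

def binary_reward(w, x, tol=0.5):
    # pass/fail feedback: 1 if the prediction is within tol of the target, else 0
    return 1.0 if abs(predict(w, x) - 3.0 * x) <= tol else 0.0

def real_reward(w, x):
    # graded feedback: higher reward for smaller prediction error
    return -abs(predict(w, x) - 3.0 * x)

def evolve(reward_fn, points, generations=200):
    # (1+1)-style hill climber as a stand-in for a full coevolutionary EA
    w = 0.0
    fit = sum(reward_fn(w, x) for x in points)
    for _ in range(generations):
        cand = w + random.gauss(0.0, 0.5)
        cand_fit = sum(reward_fn(cand, x) for x in points)
        if cand_fit >= fit:  # ties accepted: binary reward then drifts on plateaus
            w, fit = cand, cand_fit
    return w

points = [random.uniform(-1.0, 1.0) for _ in range(20)]
w_bin = evolve(binary_reward, points)
w_real = evolve(real_reward, points)
```

Under the graded reward every mutation that shrinks |w - 3| is strictly better, so the climber converges toward the target slope; under the binary reward many distinct slopes pass the same test points, so fitness is piecewise constant and the search wanders, which loosely mirrors the abstract's observation that extra structure (distinctions between candidates) matters most when feedback is binary.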
Similar resources
Maximal width learning of binary functions
This paper concerns learning binary-valued functions defined on ℝ, and investigates how a particular type of 'regularity' of hypotheses can be used to obtain better generalization error bounds. We derive error bounds that depend on the sample width (a notion similar to that of sample margin for real-valued functions). This motivates learning algorithms that seek to maximize sample width.
Gradient-Based Learning Updates Improve XCS Performance in Multistep Problems
This paper introduces a gradient-based reward prediction update mechanism to the XCS classifier system, as applied in neural-network-type learning and function approximation mechanisms. A strong relation of XCS to tabular reinforcement learning and, more importantly, to neural-based reinforcement learning techniques is drawn. The resulting gradient-based XCS system learns more stable and reliable i...
Associative Reinforcement Learning of Real-valued Functions
Associative reinforcement learning (ARL) tasks, defined originally by Barto and Anandan [1], combine elements of problems involving optimization under uncertainty, studied by learning automata theorists, and supervised learning pattern-classification. The stochastic real-valued (SRV) unit algorithm [6] has been designed for an extended version of ARL tasks wherein the learning system's outputs can...
Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)
In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through the dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...
A Policy Search Method For Temporal Logic Specified Reinforcement Learning Tasks
Reward engineering is an important aspect of reinforcement learning. Whether or not the users' intentions can be correctly encapsulated in the reward function can significantly impact the learning outcome. Current methods rely on manually crafted reward functions that often require parameter tuning to obtain the desired behavior. This operation can be expensive when exploration requires system...
Publication date: 2009